
Session stop status flicker by splitting tool boundary from turn boundary#114

Draft
DDKinger wants to merge 1 commit into main from dev/yuazha/turn-scoped-status

Conversation

@DDKinger

What

A model turn typically calls many tools sequentially (e.g. prompt.submit → tool.starting → tool.finished → tool.starting → tool.finished → ... → stop). Under the old wire mapping, every agent.tool.finished routed to SessionEvent::ToolCompleted and the reducer demoted Working→Idle; the next agent.tool.starting re-promoted to Working. The F2 status badge flickered green↔grey at every tool boundary inside a single turn.

This PR splits the tool boundary from the turn boundary so the F2 status badge stays stable across an entire turn, and only transitions to Idle when the turn actually ends.

How

Add SessionEvent::TurnCompleted and re-scope ToolCompleted:

| Event | Old semantics | New semantics |
| --- | --- | --- |
| ToolCompleted | Working/Attention → Idle; clear current_tool | clear current_tool; Attention → Working (turn continues); Working stays Working |
| TurnCompleted (NEW) | n/a | Working/Attention → Idle; clear current_tool and attention_reason; Error sticky; Ended/Historical no-op |
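
As a rough illustration of the table above, the split can be sketched as a small reducer. The names `AgentStatus`, `SessionEvent`, `current_tool`, and `attention_reason` come from this PR; the `Entry` struct, the `reduce` function, the omission of the `{ key }` payload, and applying the Ended/Historical guard to ToolCompleted as well are assumptions made for the sketch, not the code in agent_sessions.rs.

```rust
// Sketch only: type shapes are assumptions, not the real definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentStatus {
    Idle,
    Working,
    Attention,
    Error,
    Ended,
    Historical,
}

// Hypothetical row type holding the fields the PR description mentions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub status: AgentStatus,
    pub current_tool: Option<String>,
    pub attention_reason: Option<String>,
}

// The real events carry a `key`; it is left out here for brevity.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionEvent {
    ToolCompleted,
    TurnCompleted,
}

pub fn reduce(entry: &mut Entry, event: SessionEvent) {
    // Resurrection guard: terminal rows ignore the event entirely.
    // (Guarding ToolCompleted this way is an assumption of the sketch.)
    if matches!(entry.status, AgentStatus::Ended | AgentStatus::Historical) {
        return;
    }
    match event {
        SessionEvent::ToolCompleted => {
            // Tool boundary: clear the tool, never demote Working to Idle.
            entry.current_tool = None;
            if entry.status == AgentStatus::Attention {
                // The user answered; the agent resumes the turn.
                entry.status = AgentStatus::Working;
                entry.attention_reason = None;
            }
        }
        SessionEvent::TurnCompleted => {
            // Turn boundary: the only place Working/Attention becomes Idle.
            entry.current_tool = None;
            if matches!(entry.status, AgentStatus::Working | AgentStatus::Attention) {
                entry.status = AgentStatus::Idle;
                entry.attention_reason = None;
            }
            // Error falls through untouched, so it stays sticky.
        }
    }
}
```

Calling `reduce` with `SessionEvent::ToolCompleted` on a Working entry leaves the status at Working and only clears `current_tool`; the same entry goes to Idle only on `SessionEvent::TurnCompleted`.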

Wire mapping in route_agent_event_to_registry:

agent.tool.completed
agent.tool.finished
agent.tool.failed
agent.subagent.stop   →  ToolCompleted   (no demote)
agent.stop            →  TurnCompleted   (the turn boundary)

agent.subagent.stop moves from the agent.stop arm to the ToolCompleted arm: Claude's SubagentStop hook fires when a Task-tool sub-agent finishes; the main agent continues the turn. Folding it in with agent.stop was part of the original flicker problem.
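
The mapping above can be pictured as a single match over the wire event name. This is a minimal sketch, not the body of route_agent_event_to_registry: the `Boundary` enum and the `classify_completion` function are hypothetical names, and only the event strings and their targets are taken from this PR.

```rust
// Hypothetical stand-in for the two SessionEvent variants involved.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Boundary {
    ToolCompleted,
    TurnCompleted,
}

// Classify a wire event name; events unrelated to completion yield None.
pub fn classify_completion(event_name: &str) -> Option<Boundary> {
    match event_name {
        // Tool-level boundaries, including a finished Task-tool sub-agent.
        "agent.tool.completed"
        | "agent.tool.finished"
        | "agent.tool.failed"
        | "agent.subagent.stop" => Some(Boundary::ToolCompleted),
        // The only event that ends the turn.
        "agent.stop" => Some(Boundary::TurnCompleted),
        _ => None,
    }
}
```

Keeping `agent.subagent.stop` in the first arm is what prevents a finished sub-agent from being treated as the end of the main agent's turn.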

Turn timeline — before vs after

Before (4-tool turn = 4 flickers):

prompt.submit   → Working
tool.starting   → Working
tool.finished   → Idle    ← flicker
tool.starting   → Working ← flicker
tool.finished   → Idle    ← flicker
... repeat ...
stop            → Idle

After:

prompt.submit   → Working (tool=prompt)
tool.starting   → Working (tool=ls)
tool.finished   → Working (tool=None)
tool.starting   → Working (tool=grep)
tool.finished   → Working (tool=None)
notification    → Attention (reason=...)
tool.completed  → Working (reason cleared, agent resumes)
tool.starting   → Working (tool=edit)
tool.finished   → Working
stop            → Idle

current_tool still updates at every tool boundary for transparency; only status is held stable.
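
To make the before/after difference concrete, the two timelines can be replayed through a toy status machine that counts every status change. Everything here (`Status`, `Step`, `next_status`, `count_status_changes`, `four_tool_turn`) is a hypothetical simplification that ignores Attention, Error, and `current_tool`; it only models whether a finished tool demotes to Idle.

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum Status {
    Idle,
    Working,
}

#[derive(Clone, Copy)]
enum Step {
    PromptSubmit,
    ToolStarting,
    ToolFinished,
    Stop,
}

// `tool_finished_demotes = true` models the old mapping, `false` the new one.
fn next_status(current: Status, step: Step, tool_finished_demotes: bool) -> Status {
    match step {
        Step::PromptSubmit | Step::ToolStarting => Status::Working,
        Step::ToolFinished if tool_finished_demotes => Status::Idle,
        Step::ToolFinished => current,
        Step::Stop => Status::Idle,
    }
}

// Count how often the badge would change, starting from Idle.
fn count_status_changes(steps: &[Step], tool_finished_demotes: bool) -> usize {
    let mut status = Status::Idle;
    let mut changes = 0;
    for &step in steps {
        let next = next_status(status, step, tool_finished_demotes);
        if next != status {
            changes += 1;
        }
        status = next;
    }
    changes
}

// prompt.submit, four tool.starting/tool.finished pairs, then stop.
fn four_tool_turn() -> Vec<Step> {
    let mut steps = vec![Step::PromptSubmit];
    for _ in 0..4 {
        steps.push(Step::ToolStarting);
        steps.push(Step::ToolFinished);
    }
    steps.push(Step::Stop);
    steps
}
```

In this model `count_status_changes(&four_tool_turn(), true)` returns 8 and `count_status_changes(&four_tool_turn(), false)` returns 2 (Idle → Working at the prompt, Working → Idle at stop). The sketch counts individual status changes, which is a different unit from the "flickers" tallied in the Before heading.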

Edge cases covered

  • Quick reply, no tools: prompt.submit → stop takes the status Idle → Working → Idle. ✓
  • ask_user round-trip: submit → notif (Attention) → tool.completed (Working) → ... → stop (Idle). This is the user-input case, where Attention resolves back to Working rather than Idle. ✓
  • Mid-turn error: submit → ConnectionFailed → stop. Error is sticky; TurnCompleted does NOT clear it. Surfaces until next prompt cycle. ✓
  • Sub-agent: submit → tool.starting (Task) → subagent.stop → ... → stop. Subagent.stop no longer demotes — main agent continues seamlessly. ✓
  • Stale agent.stop on Ended/Historical row: resurrection guard returns no-op so it can't re-enable Enter/focus on a dead pane GUID. ✓

Files

| File | Change |
| --- | --- |
| agent_sessions.rs | New SessionEvent::TurnCompleted; helper reducer split |
| session_registry.rs | New SessionHookParams::TurnCompleted (wire serialization with #[serde(tag = "kind")]); master reducer split mirrors the helper |
| app.rs | Wire mapping updated in route_agent_event_to_registry |
| master/mod.rs | session_event_key recognizes TurnCompleted so title-refresh and other key-keyed paths see it |

Tests

16 new tests, all passing; 560 total under cargo test --bin wta.

Reducer (helper, agent_sessions.rs):

  • tool_completed_keeps_working_so_status_does_not_flicker_between_tools
  • turn_completed_returns_working_to_idle
  • multi_tool_turn_stays_working_until_turn_completed
  • tool_completed_demotes_attention_to_working_so_turn_continues
  • turn_completed_demotes_attention_to_idle_and_clears_reason
  • turn_completed_does_not_clear_error_state
  • tool_completed_does_not_clear_error_state
  • turn_completed_on_terminal_row_is_noop_resurrection_guard
  • tool_completed_after_ask_user_resumes_working_not_idle

Reducer (master, session_registry.rs):

  • master_reducer_multi_tool_turn_stays_working_until_turn_completed
  • master_reducer_turn_completed_does_not_resurrect_terminal_row
  • master_reducer_turn_completed_does_not_clear_error
  • Existing master_reducer_tool_lifecycle_and_notification_update_activity_fields updated to assert Attention→Working on ToolCompleted, Working→Idle on TurnCompleted.

End-to-end through route_agent_event_to_registry (app.rs):

  • route_multi_tool_turn_stays_working_until_agent_stop
  • route_subagent_stop_does_not_end_the_turn
  • route_ask_user_answer_resumes_working_not_idle

Wire compatibility

Old binaries (master/helpers) that speak the old session_hook ext-notification never send TurnCompleted; they keep mapping agent.stop to ToolCompleted. Under the new reducer, an old peer's agent.stop therefore reads as "tool just finished, turn continues", so from the new peer's point of view the row stays Working until the user does something. This is observable but graceful (no crash, no data corruption). Mixed-version peers are not expected in normal deployment because all wta processes ship together in the same MSIX package; this is called out for awareness rather than as a blocker.

…flickering

A model turn typically calls many tools sequentially. Under the old
wire mapping, every `agent.tool.finished` was routed to
`SessionEvent::ToolCompleted` and the reducer demoted
Working->Idle; the next `agent.tool.starting` re-promoted to
Working. The F2 status badge flickered green<->grey at every tool
boundary inside a single turn -- visible in hook-trace.log as
ToolStarting/ToolCompleted pairs every few hundred ms.

This split makes the wire vocabulary match the actual semantics:

* `SessionEvent::ToolCompleted { key }` is now ONLY a single-tool
  boundary. The reducer clears `current_tool` (for transparency on
  what's running right now) but does NOT demote Working->Idle.
  Special case: Attention->Working models `ask_user` (or a
  permission_prompt) -- the user just answered, the agent will
  resume the turn.

* `SessionEvent::TurnCompleted { key }` (NEW) is the turn-ending
  event. Demotes Working/Attention->Idle and clears `current_tool`
  and `attention_reason`. Honors the same Ended/Historical
  resurrection guard as the other lifecycle events. `Error` is
  intentionally sticky -- a `ConnectionFailed` from mid-turn should
  keep surfacing until the next prompt cycle.

Wire mapping changes in `route_agent_event_to_registry`:

  agent.tool.completed
  agent.tool.finished
  agent.tool.failed
  agent.subagent.stop   -> ToolCompleted   (no demote)
  agent.stop            -> TurnCompleted   (the turn boundary)

`agent.subagent.stop` moves from the `agent.stop` arm to the
`ToolCompleted` arm: Claude's SubagentStop hook fires when a
Task-tool sub-agent finishes; the main agent continues the turn.
Folding it in with `agent.stop` was part of the original flicker
problem.

Touched:

* agent_sessions.rs -- new `SessionEvent::TurnCompleted`; helper
  reducer split (ToolCompleted no longer demotes; TurnCompleted does).
* session_registry.rs -- new `SessionHookParams::TurnCompleted` (wire
  serialization with `#[serde(tag = "kind")]`); master reducer split
  mirrors the helper.
* app.rs -- wire mapping updated.
* master/mod.rs -- `session_event_key` recognizes TurnCompleted so
  title-refresh and other key-keyed paths see it.

Tests (16 new, all passing; 560 total `cargo test --bin wta`):

* agent_sessions:
  - tool_completed_keeps_working_so_status_does_not_flicker_between_tools
  - turn_completed_returns_working_to_idle
  - multi_tool_turn_stays_working_until_turn_completed
  - tool_completed_demotes_attention_to_working_so_turn_continues
  - turn_completed_demotes_attention_to_idle_and_clears_reason
  - turn_completed_does_not_clear_error_state
  - tool_completed_does_not_clear_error_state
  - turn_completed_on_terminal_row_is_noop_resurrection_guard
  - tool_completed_after_ask_user_resumes_working_not_idle
* session_registry (master reducer mirror):
  - master_reducer_multi_tool_turn_stays_working_until_turn_completed
  - master_reducer_turn_completed_does_not_resurrect_terminal_row
  - master_reducer_turn_completed_does_not_clear_error
  - Updated master_reducer_tool_lifecycle_and_notification_update_activity_fields
    to assert Attention->Working on ToolCompleted, Working->Idle on
    TurnCompleted.
* app (end-to-end via route_agent_event_to_registry):
  - route_multi_tool_turn_stays_working_until_agent_stop
  - route_subagent_stop_does_not_end_the_turn
  - route_ask_user_answer_resumes_working_not_idle

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 29, 2026 07:47
@DDKinger DDKinger changed the title feat(wta): stop F2 status flicker by splitting tool boundary from turn boundary Session stop status flicker by splitting tool boundary from turn boundary May 29, 2026

Copilot AI left a comment

Pull request overview

This PR refines WTA’s agent-session activity modeling by separating “tool boundary” events from the actual “turn boundary,” so the F2 status badge stays stable (Working/Attention) throughout a multi-tool model turn and only transitions to Idle when the turn truly ends.

Changes:

  • Introduces SessionEvent::TurnCompleted and updates reducers so ToolCompleted no longer demotes Working→Idle mid-turn.
  • Updates routing (route_agent_event_to_registry) so agent.stop maps to TurnCompleted, while tool lifecycle + agent.subagent.stop map to ToolCompleted.
  • Extends master-side plumbing (wire params + key extraction) and adds/updates tests to enforce “no flicker within a turn.”

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tools/wta/src/agent_sessions.rs | Adds TurnCompleted and adjusts the helper-side reducer + tests to keep Working stable across tool boundaries. |
| tools/wta/src/session_registry.rs | Adds wire support for TurnCompleted and mirrors the reducer behavior on the master registry side with new tests. |
| tools/wta/src/app.rs | Updates event routing so agent.stop is the sole turn boundary demoting to Idle; adds end-to-end routing tests. |
| tools/wta/src/master/mod.rs | Updates event key extraction to include TurnCompleted and extends tests accordingly. |

Comments suppressed due to low confidence (2)

tools/wta/src/agent_sessions.rs:499

  • TurnCompleted’s doc/comment say it clears turn-scoped scratch fields (attention_reason/current_tool), but the reducer only clears attention_reason when demoting Working|Attention -> Idle. If the session is in Error (which is intentionally sticky), a stale attention_reason can persist across turn end and be propagated/serialized even though the turn has ended. Clearing attention_reason unconditionally (after the Ended/Historical guard) keeps the state consistent with the docs and avoids leaking stale “needs input” messages into later UI.
                    let demotable = entry.status == AgentStatus::Working
                        || entry.status == AgentStatus::Attention;
                    if demotable {
                        entry.status            = AgentStatus::Idle;
                        entry.attention_reason  = None;

tools/wta/src/session_registry.rs:1146

  • Master reducer TurnCompleted claims to clear turn-scoped scratch fields, but currently only clears attention_reason when demoting Working|Attention -> Idle. If the row is in Error, attention_reason can remain set and be sent over the wire even though the turn ended. Clear attention_reason unconditionally after the Ended/Historical guard to keep state consistent and avoid stale prompts lingering behind an Error state.
            if matches!(entry.status, Some(AgentStatus::Working | AgentStatus::Attention)) {
                entry.status = Some(AgentStatus::Idle);
                entry.attention_reason = None;
            }
            entry.current_tool = None;
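
For illustration, the change suggested in the two suppressed comments could look roughly like the sketch below. `RegistryEntry` and `apply_turn_completed` are hypothetical names, and the `Option<AgentStatus>` field only mirrors the shape visible in the master-reducer excerpt above; this is one possible reading of the suggestion, not code from the PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentStatus {
    Idle,
    Working,
    Attention,
    Error,
    Ended,
    Historical,
}

// Hypothetical row type; only the fields named in the review are modeled.
#[derive(Debug, Default)]
pub struct RegistryEntry {
    pub status: Option<AgentStatus>,
    pub current_tool: Option<String>,
    pub attention_reason: Option<String>,
}

pub fn apply_turn_completed(entry: &mut RegistryEntry) {
    // Resurrection guard: terminal rows are left exactly as they are.
    if matches!(entry.status, Some(AgentStatus::Ended | AgentStatus::Historical)) {
        return;
    }
    // Demote only Working/Attention; Error stays sticky.
    if matches!(entry.status, Some(AgentStatus::Working | AgentStatus::Attention)) {
        entry.status = Some(AgentStatus::Idle);
    }
    // Suggested change: clear turn-scoped scratch fields unconditionally,
    // so a stale reason cannot linger behind an Error status.
    entry.attention_reason = None;
    entry.current_tool = None;
}
```

With this ordering, a row that is in Error with a leftover `attention_reason` keeps its Error status after the turn ends but no longer carries the stale reason.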

@github-actions

@check-spelling-bot Report

⚠️ Dictionary not found

Problems were encountered retrieving check dictionaries (cspell:python/src/common/extra.txt cspell:cpp/src/lang-jargon.txt cspell:cpp/src/stdlib-cmath.txt cspell:cpp/src/compiler-msvc.txt cspell:cpp/src/stdlib-c.txt cspell:public-licenses/src/generated/public-licenses.txt cspell:haskell/dict/haskell.txt cspell:python/src/additional_words.txt cspell:svelte/dict/svelte.txt cspell:java/src/java-terms.txt cspell:public-licenses/src/additional-licenses.txt cspell:scala/dict/scala.txt cspell:css/dict/css.txt cspell:cpp/src/ecosystem.txt cspell:ada/dict/ada.txt cspell:r/src/r.txt cspell:django/dict/django.txt cspell:python/src/python/python.txt cspell:cpp/src/stdlib-cerrno.txt cspell:elixir/dict/elixir.txt cspell:dart/src/dart.txt cspell:gaming-terms/dict/gaming-terms.txt cspell:sql/src/sql.txt cspell:docker/src/docker-words.txt cspell:k8s/dict/k8s.txt cspell:html/dict/html.txt cspell:cpp/src/people.txt cspell:golang/dict/go.txt cspell:typescript/dict/typescript.txt cspell:python/src/python/python-lib.txt cspell:dotnet/dict/dotnet.txt cspell:clojure/src/clojure.txt cspell:lua/dict/lua.txt cspell:node/dict/node.txt cspell:npm/dict/npm.txt cspell:shell/dict/shell-all-words.txt cspell:cpp/src/stdlib-cpp.txt cspell:redis/dict/redis.txt cspell:cpp/src/template-strings.txt cspell:ruby/dict/ruby.txt cspell:latex/dict/latex.txt cspell:sql/src/tsql.txt cspell:software-terms/dict/softwareTerms.txt cspell:cpp/src/compiler-clang-attributes.txt cspell:monkeyc/src/monkeyc_keywords.txt cspell:fullstack/dict/fullstack.txt cspell:software-terms/dict/webServices.txt cspell:swift/src/swift.txt cspell:php/dict/php.txt cspell:rust/dict/rust.txt cspell:java/src/java.txt cspell:cpp/src/compiler-gcc.txt cspell:cpp/src/lang-keywords.txt cspell:powershell/dict/powershell.txt).

⚠️ For more information, see check-dictionary-not-found.

🔴 Please review

See the 📂 files view, the 📜action log, 👼 SARIF report, or 📝 job summary for details.

| ❌ Errors and Warnings | Count |
| --- | --- |
| ⚠️ binary-file | 6 |
| ⚠️ check-dictionary-not-found | 53 |
| ❌ forbidden-pattern | 13 |
| ⚠️ ignored-expect-variant | 1 |
| ⚠️ noisy-file | 7 |
| ⚠️ single-line-file | 1 |

See ❌ Event descriptions for more information.

These words are not needed and should be removed:

Backgrounder Ccc cplusplus ctl Debian dotnet drv endptr EOFs evt Fullwidth gitlab hdr idl IME inbox intelligentterminal Ioctl KVM lbl lld lsb NONINFRINGEMENT notif oss outdir Podcast pri prioritization PSobject rcv segfault Signtool sourced SWP Tbl testname transitioning unk unparseable unregisters Virt VMs VTE webpage websites WTCLI xsi

Some files were automatically ignored 🙈

These sample patterns would exclude them:

^\.dotnet\/\.dotnet\/TelemetryStorageService/
^\Q.dotnet/.dotnet/.workloadAdvertisingManifestSentinel10.0.200\E$
^\Q.dotnet/.dotnet/10.0.201.aspNetCertificateSentinel\E$
^\Q.dotnet/.dotnet/10.0.201.dotnetFirstUseSentinel\E$
^\Q.dotnet/.dotnet/10.0.201.toolpath.sentinel\E$
^\Qinstaller/bootstrap/target/.rustc_info.json\E$
^copilot-version\.err$
^copilot-version\.out$

You should consider excluding directory paths (e.g. (?:^|/)vendor/), filenames (e.g. (?:^|/)yarn\.lock$), or file extensions (e.g. \.gz$)

You should consider adding them to:

.github/actions/spelling/excludes.txt

File matching is via Perl regular expressions.

To check these files, more of their words need to be in the dictionary than not. You can use patterns.txt to exclude portions, add items to the dictionary (e.g. by adding them to allow.txt), or fix typos.

To update file exclusions and remove the previously acknowledged and now absent words, you could run the following commands

... in a clone of the git@github.com:microsoft/intelligent-terminal.git repository
on the dev/yuazha/turn-scoped-status branch (ℹ️ how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/cfb6f7e75bbfc89c71eaa30366d0c166f1bd9c8c/apply.pl' |
perl - 'https://github.com/microsoft/intelligent-terminal/actions/runs/26625092202/attempts/1' &&
git commit -m 'Update check-spelling metadata'
Available 📚 dictionaries could cover words (expected and unrecognized) not in the 📘 dictionary

This includes both expected items (2062) from .github/actions/spelling/expect/alphabet.txt .github/actions/spelling/expect/expect.txt .github/actions/spelling/expect/web.txt

| Dictionary | Entries | Covers | Uniquely |
| --- | --- | --- | --- |
| cspell:csharp/csharp.txt | 32 | 2 | 2 |
| cspell:aws/aws.txt | 232 | 2 | 2 |
| cspell:fonts/fonts.txt | 536 | 1 | 1 |

Consider adding to the extra_dictionaries array (in the .github/actions/spelling/config.json file):

    "cspell:csharp/csharp.txt",
    "cspell:aws/aws.txt",
    "cspell:fonts/fonts.txt",

To stop checking additional dictionaries, put (in the .github/actions/spelling/config.json file):

"check_extra_dictionaries": []
Forbidden patterns 🙅 (8)

In order to address this, you could change the content to not match the forbidden patterns (comments before forbidden patterns may help explain why they're forbidden), add patterns for acceptable instances, or adjust the forbidden patterns themselves.

These forbidden patterns matched content:

Should be nonexistent
\b[Nn]o[nt][- ]existent\b
Should probably be Otherwise,
(?<=\. )Otherwise\s
Should be preexisting
[Pp]re[- ]existing
Complete sentences in parentheticals should not have a space before the period.
\s\.\)(?!.*\}\})
Should be ; otherwise or . Otherwise

https://study.com/learn/lesson/otherwise-in-a-sentence.html

, [Oo]therwise\b
Should be reentrant
[Rr]e[- ]entrant
Should be whether or not ...
(?i)\b(?:whe|ra)ther(?:\s\w+)+ or not\.
Should be WinGet
\bWinget\b
✏️ Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

🔬 You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. 😉

If the flagged items are 🤯 false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

@DDKinger DDKinger marked this pull request as draft May 31, 2026 10:20